The histogram of survival time shows a right-skewed distribution: many victims live with the consequences of breast cancer for several years, while a subset experience earlier “fatal” outcomes. Coloring by vital status highlights that deaths cluster at shorter follow-up times, as we would expect when a dangerous criminal strikes early.
The Kaplan–Meier curves show how survival unfolds over time for each racial group. Early on, survival probabilities remain high, but the curves gradually step downward as more victims succumb. The attached risk table shows how many patients remain “under surveillance” at each time point, reminding the jury that the end of each curve is based on relatively few individuals.
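To make the stepwise shape of these curves concrete, the following is a minimal sketch of the Kaplan–Meier product-limit estimator in plain Python. It is an illustration only, not the code that produced the figures: the function name `kaplan_meier` and the toy follow-up data are hypothetical. Each step multiplies the running survival probability by the fraction of at-risk patients who survive that event time, and the number at risk is returned alongside, as in a risk table.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, n_at_risk, survival)] at each distinct death time.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # deaths plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: fraction of those at risk who survive time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, at_risk, survival))
        # Everyone observed at t (dead or censored) leaves the risk set.
        at_risk -= leaving[t]
    return curve


# Toy data (hypothetical, not from the study): months of follow-up and status.
toy_times = [6, 7, 10, 15, 19, 25]
toy_events = [1, 0, 1, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
for t, n, s in km_curve:
    print(f"t={t:>2}  at risk={n}  S(t)={s:.3f}")
```

Censored patients never cause a step down, but they do shrink the risk set, which is why the last steps of a curve rest on few individuals and are large.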
Comparing curves across racial groups reveals apparent disparities in survival. Some groups show consistently lower survival probabilities at comparable times, suggesting that Breast Cancer does not strike all communities equally. These differences motivate treating race as a key covariate in our multivariable survival model to formally assess these disparities while adjusting for other clinical factors.
The Cox proportional hazards model allows us to quantify the impact of various clinical and demographic factors on survival time. Each hazard ratio (HR) is the exponentiated model coefficient and gives the multiplicative change in the hazard of death associated with a covariate, holding all other factors constant. An HR greater than 1 indicates increased risk, while an HR less than 1 indicates reduced risk (a protective effect).
The forest plot visualizes the hazard ratios and their 95% confidence intervals for each covariate in the Cox model. Covariates whose confidence intervals do not cross 1 are statistically significant predictors of survival at the 5% level. This visualization helps the jury see at a glance which factors are associated with the largest changes in risk and how precisely each effect is estimated.
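As a small worked illustration of how a hazard ratio and its interval are read, the sketch below converts a Cox coefficient and its standard error into an HR with a Wald-type 95% confidence interval. The helper names and the two coefficient/standard-error pairs are hypothetical and are not taken from the fitted model; the interval actually reported by the analysis software may have been computed differently.

```python
import math


def hazard_ratio_ci(coef, se, z=1.96):
    """Return (HR, lower, upper) from a log-hazard coefficient and its SE."""
    hr = math.exp(coef)
    lower = math.exp(coef - z * se)
    upper = math.exp(coef + z * se)
    return hr, lower, upper


def excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1.0 or upper < 1.0


# Hypothetical coefficients, for illustration only.
hr_a, lo_a, hi_a = hazard_ratio_ci(coef=0.405, se=0.15)
hr_b, lo_b, hi_b = hazard_ratio_ci(coef=-0.10, se=0.20)

print(f"A: HR={hr_a:.2f} ({lo_a:.2f}, {hi_a:.2f}) excludes 1: {excludes_one(lo_a, hi_a)}")
print(f"B: HR={hr_b:.2f} ({lo_b:.2f}, {hi_b:.2f}) excludes 1: {excludes_one(lo_b, hi_b)}")
```

Because the interval is built on the log scale and then exponentiated, it is asymmetric around the HR, which is why forest plots are usually drawn on a logarithmic axis.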
The Cox model results provide compelling evidence against the defendant, Breast Cancer. Several clinical and demographic factors significantly influence survival, highlighting the complex interplay of biology and social determinants in this crime. The jury must consider these findings carefully when deliberating on the guilt of the defendant.
##                          chisq df       p
## age                     0.1051  1 0.74584
## race                    0.9508  2 0.62165
## marital_status          2.9355  4 0.56867
## t_stage                 0.2729  3 0.96505
## n_stage                 1.6048  2 0.44826
## differentiate           1.8576  3 0.60248
## tumor_size              0.9924  1 0.31915
## estrogen_status        29.3770  1 6.0e-08
## progesterone_status    32.3417  1 1.3e-08
## regional_node_examined  0.0104  1 0.91881
## regional_node_positive  0.0164  1 0.89818
## GLOBAL                 51.1608 20 0.00015
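To be admissible, the Cox model must satisfy the proportional hazards (PH) assumption: each covariate’s effect on the hazard is roughly constant over time. We assess this using tests based on Schoenfeld residuals, reported in the table above, where a small p-value signals a possible PH violation. The p-values for estrogen status (6.0e-08) and progesterone status (1.3e-08) are far below 0.05, as is the global test (p = 0.00015), so the hazard ratios for these two covariates may change over follow-up and this part of the evidence should be weighed with caution. The remaining covariates show no evidence of violation (all p > 0.3).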
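Each p-value in the table is the upper-tail probability of a chi-square distribution with the listed degrees of freedom, evaluated at the test statistic. The sketch below recomputes a few of them from the `chisq` and `df` columns using only the Python standard library. The function `chi2_sf` is a hypothetical helper written for this illustration, not part of the original analysis; it assumes integer degrees of freedom and uses a standard recurrence for the incomplete gamma function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        k, q = 2, math.exp(-half)             # closed form for df = 2
    while k < df:
        # Step from k to k + 2 degrees of freedom.
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q


# (chisq, df, reported p) copied from selected rows of the table above.
reported = {
    "race": (0.9508, 2, 0.62165),
    "marital_status": (2.9355, 4, 0.56867),
    "t_stage": (0.2729, 3, 0.96505),
    "tumor_size": (0.9924, 1, 0.31915),
    "estrogen_status": (29.3770, 1, 6.0e-08),
    "GLOBAL": (51.1608, 20, 0.00015),
}

recomputed = {name: chi2_sf(chisq, df) for name, (chisq, df, _) in reported.items()}
for name, (chisq, df, p) in reported.items():
    print(f"{name:<16} reported={p:.3g}  recomputed={recomputed[name]:.3g}")
```

Small discrepancies in the last digit are expected, since the test statistics in the table are themselves rounded to four decimal places.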
The bootstrap analysis provides robust estimates of hazard ratios and confidence intervals by repeatedly resampling patients with replacement and re-estimating the model on each resample. This approach helps validate the stability of our Cox model findings, ensuring that the evidence against the defendant is not an artifact of random variation in the sample. The 95% bootstrap confidence intervals offer additional assurance about the reliability of our estimates.
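The resampling idea can be sketched in a few lines. The example below is a simplified stand-in, not the analysis itself: instead of refitting a Cox model on every resample, it bootstraps a crude two-group hazard ratio (deaths per unit of follow-up in one group divided by the same rate in the other) on simulated data. All function names, the simulated rates, and the 24-month censoring time are assumptions made for illustration, and the percentile method shown may differ from the interval type used in the report.

```python
import math
import random


def crude_hazard_ratio(rows):
    """Death rate in group 1 divided by death rate in group 0.

    rows -- list of (time, event, group) tuples; returns None if undefined.
    """
    deaths = [0, 0]
    follow_up = [0.0, 0.0]
    for time, event, group in rows:
        deaths[group] += event
        follow_up[group] += time
    if deaths[0] == 0 or deaths[1] == 0:
        return None
    return (deaths[1] / follow_up[1]) / (deaths[0] / follow_up[0])


def simulate_rows(n_per_group=100, seed=1):
    """Simulated (hypothetical) data: exponential survival, censored at 24 months."""
    rng = random.Random(seed)
    rows = []
    for group, rate in ((0, 0.05), (1, 0.10)):
        for _ in range(n_per_group):
            t = rng.expovariate(rate)
            rows.append((min(t, 24.0), 1 if t <= 24.0 else 0, group))
    return rows


def percentile_bootstrap_ci(rows, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval from resampling whole patients with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        resample = [rows[rng.randrange(n)] for _ in range(n)]
        value = statistic(resample)
        if value is not None:  # skip resamples where the ratio is undefined
            estimates.append(value)
    estimates.sort()
    m = len(estimates)
    lower = estimates[math.floor(alpha / 2 * m)]
    upper = estimates[min(m - 1, math.ceil((1 - alpha / 2) * m) - 1)]
    return lower, upper


toy_rows = simulate_rows()
point = crude_hazard_ratio(toy_rows)
ci_low, ci_high = percentile_bootstrap_ci(toy_rows, crude_hazard_ratio)
print(f"HR={point:.2f}, 95% bootstrap CI=({ci_low:.2f}, {ci_high:.2f})")
```

Resampling whole patients, rather than times and statuses separately, keeps each patient's follow-up time, outcome, and covariates together, which is what lets the spread of the resampled estimates reflect sampling variability.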
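The 5-fold cross-validation assesses the predictive performance of our Cox model by partitioning the data into five folds, fitting the model on four of them, and evaluating it on the held-out fold, with each fold serving once as the test set. The concordance index (C-index) quantifies how well the model discriminates between patients with different survival outcomes: it is the proportion of comparable patient pairs in which the patient predicted to be at higher risk dies first. A C-index close to 1 indicates excellent predictive ability, while a value around 0.5 is no better than random guessing. Our model’s mean C-index of 0.733 (SD 0.051) across folds indicates reasonably good, though far from perfect, discrimination in predicting who is at highest risk of death.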
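For readers who want to see what the C-index measures, here is a minimal sketch of Harrell's concordance index together with a simple way to split patients into five folds. Both functions are hypothetical helpers written for this illustration. The C-index version ignores pairs with tied survival times, so it is a simplification of what survival software computes, and the toy inputs are not study data.

```python
import random


def concordance_index(times, events, risk_scores):
    """Share of comparable pairs in which the higher-risk patient dies first.

    A pair is comparable when the patient with the shorter time died
    (was not censored). Ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the "earlier death"
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def k_fold_indices(n, k=5, seed=0):
    """Shuffle patient indices and deal them into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]


# Toy example (hypothetical): higher score should mean earlier death.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
c_perfect = concordance_index(toy_times, toy_events, [4, 3, 2, 1])
c_reversed = concordance_index(toy_times, toy_events, [1, 2, 3, 4])
c_constant = concordance_index(toy_times, toy_events, [1, 1, 1, 1])
folds = k_fold_indices(23, k=5)
print(c_perfect, c_reversed, c_constant, [len(f) for f in folds])
```

In a full cross-validation, the model would be fitted on four folds, its risk scores computed for the fifth, and `concordance_index` evaluated on that held-out fold; the mean and standard deviation over the five values are what the report summarizes.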